Direct Inquiries to:



J. W. Burgeson 
Cleveland, Ohio 44113



DECK KEY 



Three decks are included: 

1. SPS source deck           225 cards
   Sequence numbered in columns 1-4

2. KWIC object deck           55 cards
   Sequence numbered in columns 76-80

3. Sample data                24 cards
   Sequence numbered in columns 79-80



Modifications or revisions to this program, as they occur, 
will be announced in the appropriate Catalog of Programs 
for IBM Data Processing Systems. When such an announcement
occurs, users should order a complete new program
from the Program Information Department. 
PROGRAM BRIEF



This program provides a method of indexing any set of titles. 
Each word indexed appears in the context of the title.

Titles may contain up to 600 characters plus a twelve-character label.
Up to 650 words may be specified which are not to be indexed.
The program indexes each word in every title except for these words
and words shorter than 3 characters. Any number of titles may be
indexed.

One card is punched for each word indexed. Up to 60 characters of
the title plus the 12-character label appear on each card. The output
must be sorted alphabetically and listed 80-80 off line.

KWIC is coded in SPS for any card 1620. 20K of memory is used.

Output is produced at punch speed (125 cpm). The sample problem
runs in about one minute. This program has been used to index
several personal libraries.

This program and its documentation were written by two IBM employees.
It was developed for a specific purpose and submitted for general 
distribution to interested parties in the hope that it might prove 
helpful to other members of the data processing community. The 
program and its documentation are essentially in the author's original 
form. IBM serves only as the distribution agency in supplying this 
program. Questions concerning the use of the program should be 
directed to the author's attention.







DETAILED PROGRAM DESCRIPTION 

Part of the following section is taken from G.I. Manual E20-8091. 

"The simplest form of a quickly assembled index is an alphabetic listing 
of significant words from some store of information — for example, the 
type of index generally found at the back of a textbook. The simplicity
of such an index is due to the fact that the reader is assumed to be 
familiar with the subject matter of the book. In dealing with documents 
on many subjects, however, the significance of individual words can 
be determined only by referring to the statement from which the word 
was taken." 

"This time-consuming reference can be avoided by listing the selected 
words (called keywords) together with the words surrounding them — 
that is, listing the keywords in context — because this reveals the 
specific sense in which the keyword has been used." 

"KWIC indexing can be adapted to the literature of any scientific 
discipline, as well as to such areas as correspondence files, files of 
internally generated memorandums, legal papers, procedure manuals, etc." 

"For a computer to select keywords when programmed for KWIC 
indexing, a word list must be stored in it to enable it to differentiate 
between significant words (that is, keywords) and nonsignificant words. 
To establish such a word list, keywords need only be defined as those 
which characterize a subject more than others. Since significance is 
difficult to predict, it is more practical simply to reject all obviously 
nonsignificant words, such as articles, conjunctions, prepositions, 
auxiliary verbs, certain adjectives, and words such as "report", 
"analysis" and "theory." In addition, words such as "chemical" in a 
listing of chemical titles, or "law" in a listing of legal titles, would be 
nonsignificant. When establishing a word list in this manner, there is a
risk of admitting words of questionable significance. These may either be
caught later through statistical analysis of frequency, or simply be
tolerated. It is the task of the personnel in control of this word list to
continually adjust it as required by the nature of the material being
indexed and as dictated by user reaction."

"The first step in setting up a KWIC indexing system is to prepare the 
list of nonsignificant words. This list is recorded and maintained on 
punched cards and forms part of the input to the computer. 
Creation of a title/location record for each document to be indexed is
the next step. This information concerning each document is punched 
on up to 10 IBM cards. 

This KWIC program makes maximum use of available space by a 
technique called "wrap-around". This term refers to the fact that
some or all of the title may precede or follow — that is, wrap around — 
the keyword. 



INPUT DESCRIPTION



The input deck consists of two parts: words not to be indexed and 
title cards. Letters A-Z, numbers 0-9, and the hyphen (-) are 
considered parts of words. Any other punches are considered 
punctuation. 
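
The word rule above can be sketched in a few lines of Python. This is an illustration only, not the SPS coding of the program; the name split_words is hypothetical, and the conversion to upper case is a convenience for modern input that the card format does not need.

```python
import re

# Assumption: a word is a maximal run of A-Z, 0-9, and hyphen, as stated above.
WORD_RE = re.compile(r"[A-Z0-9-]+")

def split_words(title: str) -> list[str]:
    """Return the words of a title; every other punch acts as punctuation."""
    return WORD_RE.findall(title.upper())

print(split_words("REAL-TIME CONTROL, 1620/1710 SYSTEMS."))
```

Note that the hyphen keeps REAL-TIME together as one word, while the slash splits 1620/1710 into two.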

Words Not To Be Indexed. These words are punched one to a card 
left justified in the first ten columns. The following 70 columns are
ignored. Words longer than 10 characters are truncated to 10 
characters. 

If a word or the first part of a word in this list is the same as a word
in a title, that word is not indexed. If the word in the title is longer
than 10 letters, only the first ten are considered in this comparison.

Some examples should clarify this. 



Words in list          Words in title which     Words in title which
not to be indexed      will not be indexed      will be indexed

COMPUTATION            COMPUTATIONS             COMPUTER
                       COMPUTATION

FORECASTS              FORECASTS                FORECASTING
                       FORECAST                 FOREWARNS
                       FORE
                       FOR

ADDRESS                ADDRESS                  ADDRESSES
                       ADD
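
The comparison rule and the examples above can be checked with a short Python sketch. It is one reading of the rule, not the program's actual coding: a title word is suppressed when its first ten characters equal a list word or the leading part of one. The function name is_suppressed is hypothetical, and the test for words shorter than 3 characters is taken from the Program Brief.

```python
def is_suppressed(title_word: str, stop_words: list[str]) -> bool:
    """True if the title word would not be indexed under the rule above."""
    if len(title_word) < 3:
        return True                      # short words are never indexed
    key = title_word[:10]                # only ten characters are compared
    # List words are truncated to ten characters when they are read in.
    return any(stop[:10].startswith(key) for stop in stop_words)

stop_list = ["COMPUTATION", "FORECASTS", "ADDRESS"]
for word in ["COMPUTATIONS", "COMPUTER", "FORE", "FORECASTING", "ADD", "ADDRESSES"]:
    print(word, "not indexed" if is_suppressed(word, stop_list) else "indexed")
```

One consequence of the prefix rule is that a long list word also suppresses every shorter word that begins it, which is why FORECASTS removes FORE and FOR.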



Title Cards 

The title may be punched anywhere in the first 60 columns of a title 
card. Up to ten cards may be used for one title. Column 1 of a 
continuing title card is assumed to follow immediately after column 60 
of the preceding card. Superfluous blanks are removed from the title 
by the program. Columns 61-72 of the first card of a title must 
contain a label identifying the title. If the title is longer than one
card, this label field on succeeding cards may either be left
blank or may be an exact duplication of the label of the first card of
the title.
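
A minimal Python sketch of how title cards might be gathered into titles follows. It assumes, as the label rule above suggests but does not state outright, that a card whose label is non-blank and differs from the current label starts a new title. The names assemble_titles and MAX_TITLE_CARDS are hypothetical.

```python
MAX_TITLE_CARDS = 10

def assemble_titles(cards: list[str]) -> list[tuple[str, str]]:
    """Group 80-column title cards into (title, label) pairs."""
    titles: list[tuple[str, str]] = []
    text, label, count = "", "", 0
    for card in cards:
        card = card.ljust(80)
        body, card_label = card[:60], card[60:72].strip()
        # Assumption: a non-blank label that differs from the current one
        # marks the first card of the next title.
        if count == 0 or (card_label and card_label != label):
            if count:
                titles.append((" ".join(text.split()), label))
            text, label, count = body, card_label, 1
        else:
            count += 1
            if count > MAX_TITLE_CARDS:
                raise ValueError("MORE THAN 10 TITLE CARDS")
            text += body                 # column 1 follows column 60 directly
    if count:
        titles.append((" ".join(text.split()), label))
    return titles
```

Joining the card bodies without a separator mirrors the statement that column 1 of a continuing card follows immediately after column 60, so a word may be split across two cards. Superfluous blanks are collapsed only after the cards are joined.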



SUMMARY 



WORDS NOT TO BE INDEXED

   Columns  1-10   Word, left justified in field
           11-80   Ignored

TITLE CARDS

   Columns  1-60   Title, anywhere in field
           61-72   Label
           73-80   Ignored



Input Deck 



1. Words not to be indexed (at least one) 

2. Blank card

3. Title cards (any number of titles) 
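
The deck layout above can be sketched in Python as a split at the first blank card. The names split_deck and MAX_STOP_WORDS are hypothetical; the 650-word limit is the one given in the Program Brief, and the error raised when no blank card is found is an addition for the sketch only.

```python
MAX_STOP_WORDS = 650

def split_deck(cards: list[str]) -> tuple[list[str], list[str]]:
    """Split an input deck into (words not to be indexed, title cards)."""
    stop_words: list[str] = []
    for i, card in enumerate(cards):
        if card.strip() == "":           # the blank card ends the word list
            return stop_words, cards[i + 1:]
        if len(stop_words) == MAX_STOP_WORDS:
            raise ValueError("MORE THAN 650 WORDS NOT TO BE INDEXED.")
        stop_words.append(card[:10].rstrip())   # columns 11-80 are ignored
    raise ValueError("no blank card found after the word list")
```

If the blank card is left out, title cards are read as list words, which is how a missing blank card can lead to the 650-word message.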
OUTPUT FORMAT

The output format looks very similar to the input title card. The
end of each title is marked by an asterisk (*). The keyword
being indexed is punched beginning in column 21, with as much
of the title as possible wrapped around it, when necessary,
from column 1 through column 60. Columns 61-62 are blank, and
the title location (the label) is punched in columns 63-74.
Columns 75-80 are blank.
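
The card layout above can be illustrated with a Python sketch. The column positions come from the description; the wrap-around rule itself is an assumption, since the exact rule used by the program is not given here. The sketch treats the title followed by " * " as a ring, places the keyword at column 21, and lets text that runs off either edge reappear at the other. The names kwic_line and kwic_card are hypothetical.

```python
def kwic_line(title: str, start: int) -> str:
    """Columns 1-60 for the keyword beginning at index `start` of the title."""
    ring = title + " * "                 # the asterisk marks the end of the title
    n = len(ring)
    natural_back = min(20, start)        # text that really precedes the keyword
    fwd = min(40, n - natural_back)      # characters shown in columns 21-60
    back = min(20, n - fwd)              # characters shown in columns 1-20
    left = "".join(ring[(start - k) % n] for k in range(back, 0, -1))
    right = "".join(ring[(start + k) % n] for k in range(fwd))
    return left.rjust(20) + right.ljust(40)

def kwic_card(title: str, start: int, label: str) -> str:
    """One 80-column output card: title, two blanks, label, six blanks."""
    return kwic_line(title, start) + "  " + label.ljust(12)[:12] + " " * 6

title = "INDEXING TITLES BY COMPUTER"
print(kwic_card(title, title.index("COMPUTER"), "DOC-001"))
```

For a title shorter than the available space, nothing wraps and the keyword simply sits at column 21 with its own context on either side. For longer titles the sketch may cut a word at column 1 or column 60; whether the program does the same is not stated in this write-up.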

An alphabetical sort of columns 25, 24, 23, 22, 21, in that 
order, will usually serve to establish a sufficient order for 
listing and subsequent retrieval. 
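
Sorting on column 25 first and column 21 last is the usual card-sorter procedure: one pass per column, least significant column first, with each pass keeping the order left by the previous one. A Python sketch of the same idea follows. The name sorter_passes is hypothetical, and Python's character order is an assumption that differs from the collating sequence of card equipment (for example, in where digits fall relative to letters).

```python
def sorter_passes(cards: list[str], columns=(25, 24, 23, 22, 21)) -> list[str]:
    """Sort a deck one column at a time, as a card sorter would."""
    deck = [card.ljust(80) for card in cards]
    for col in columns:
        # sorted() is stable, so the order from earlier passes is preserved
        # among cards that agree in the current column.
        deck = sorted(deck, key=lambda card: card[col - 1])
    return deck

deck = [" " * 20 + w for w in ["TITLES", "COMPUTER", "INDEXING", "COMPUTATION"]]
for card in sorter_passes(deck):
    print(card[20:40].rstrip())
```

Only the first five characters of each keyword take part, so COMPUTER and COMPUTATION remain in their original relative order. This is why the five-column sort is described as usually, rather than always, sufficient.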
Detailed Coding Information



This program was assembled using SPS-1620/1710 for cards, 
1620-SP-020, Version 2, Mod 12. Standard SPS coding was 
used throughout. 

Clearing memory before program loading is a necessity; otherwise
the flags in the input area will not be initialized properly. The
"last card" switch (09) is used to test for end of job. If this
point is reached prematurely due to running the cards out with
more to follow, restart from the beginning with all cards except
those already indexed.



Storage Map 

00000-00399 arithmetic tables 

00402-02641 program 

02642-18460 constants, work area 

18461-19999 not used
Operating Procedures



Sense switches are not used. Set all other switches to stop. 

1. Clear memory (not optional).

2. Load KWIC indexing object program (deck 2).

3. Place input deck in read hopper. Press reader start.

4. Ready the punch.

5. Press start key on 1620 console.

6. When all titles have been indexed the program halts (43 in
op code register).

7. Pressing the start key on the 1620 console reinitializes
and restarts the program.

Error Messages 

"MORE THAN 650 WORDS NOT TO BE INDEXED." After this message
is typed the program halts. Pressing the start key reinitializes
and restarts the program. The error is probably due to forgetting
the blank card between the two parts of the input deck.

"MORE THAN 10 TITLE CARDS". Program halts after this message is 
typed. Pressing start restarts the program to read a new title. 
This error may be due to the presence of blank cards among the title 
cards or forgetting to punch a label in the first card of a title. 



Note: This program requires that memory be cleared before loading,
as the input and output areas are not cleared by the program. Failure 
to observe this restriction will probably result in a hung program 
or spurious output. 

Note: To clear memory:

   Press "instant stop"
   Press "reset"
   Press "insert"
   Type 160001000000
   Press "release"
   Press "start"
   After about 3 seconds, press "instant stop"